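Proposing a novel molecular subtyping scheme for predicting distant recurrence-free survival in breast cancer post-neoadjuvant chemotherapy with close correlation to metabolism and senescence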


2024-05-04 21:26 | Source: compiled from the web

Background
High relapse rates remain a clinical challenge in the management of breast cancer (BC), with distant recurrence being a major driver of patient deterioration. To optimize the surveillance regimen for distant recurrence after neoadjuvant chemotherapy (NAC), we conducted a comprehensive analysis using bioinformatics and machine learning approaches.

Materials and methods
Microarray data were retrieved from the GEO database, and differential expression analysis was performed with the R package ‘limma’. We used the Metascape tool for enrichment analyses and ‘WGCNA’ to build co-expression networks, selecting the soft-threshold power with the ‘pickSoftThreshold’ function. We integrated ten machine learning algorithms and 101 algorithm combinations to identify key genes associated with distant recurrence in BC. Unsupervised clustering was performed with the R package ‘ConsensusClusterPlus’. To further screen the key gene signature of residual cancer burden (RCB), multiple knockdown studies were analyzed with the Genetic Perturbation Similarity Analysis (GPSA) database. Single-cell RNA sequencing (scRNA-seq) analysis was conducted through the Tumour Immune Single-cell Hub (TISCH) database, and the XSum algorithm was used to screen candidate small-molecule drugs based on the Connectivity Map (CMAP) database. Molecular docking was performed with Schrödinger software. GMT files containing gene sets associated with metabolism and senescence were obtained from the GSEA MSigDB database. The GSVA score of each gene set in each sample was computed with the ssGSEA method implemented in the GSVA package.

Results
Our analysis, which combined limma, WGCNA, and machine learning approaches, identified 16 RCB-relevant gene signatures influencing distant recurrence-free survival (DRFS) in BC patients following NAC. Using GPSA, we then identified GATA3 as the key gene signature of a high RCB index.
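The soft-threshold step named in the methods picks the lowest power at which the co-expression network looks approximately scale-free. The following is a minimal Python sketch of that criterion, not WGCNA's own implementation: the function names, the 10-bin histogram, the 0.85 cutoff, and the random toy data are all assumptions made for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def scale_free_fit(k, n_bins=10):
    """Signed R^2 of the fit log10 p(k) ~ log10 k over a histogram of connectivities.

    Positive values mean p(k) decreases with k, as expected for a
    scale-free network. Returns NaN when the fit is undefined.
    """
    lo, hi = min(k), max(k)
    if hi == lo:
        return float("nan")
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in k:
        counts[min(int((v - lo) / width), n_bins - 1)] += 1
    xs, ys = [], []
    for b, c in enumerate(counts):
        center = lo + (b + 0.5) * width
        if c > 0 and center > 0:  # empty bins cannot be log-transformed
            xs.append(math.log10(center))
            ys.append(math.log10(c / len(k)))
    if len(xs) < 3:
        return float("nan")
    r = pearson(xs, ys)
    # For simple linear regression R^2 = r^2 and the slope has the sign of r.
    return -math.copysign(r * r, r)


def pick_soft_threshold(genes, powers=range(1, 21), r2_cut=0.85):
    """Return (lowest power whose signed R^2 >= r2_cut, or None; all fits).

    genes: list of per-gene expression vectors, all of the same length.
    """
    n = len(genes)
    abs_cor = [
        (i, j, abs(pearson(genes[i], genes[j])))
        for i in range(n)
        for j in range(i + 1, n)
    ]
    fits = {}
    for power in powers:
        k = [0.0] * n  # connectivity: sum of unsigned adjacencies |cor|^power
        for i, j, r in abs_cor:
            a = r ** power
            k[i] += a
            k[j] += a
        fits[power] = scale_free_fit(k)
    chosen = next((p for p in sorted(fits) if fits[p] >= r2_cut), None)
    return chosen, fits


# Toy data only: 30 hypothetical genes measured in 20 samples.
random.seed(0)
toy_genes = [[random.gauss(0, 1) for _ in range(20)] for _ in range(30)]
chosen, fits = pick_soft_threshold(toy_genes, powers=range(1, 6))
print("chosen power:", chosen)
```

Purely random data like this may never reach the cutoff, in which case `chosen` is `None`; real expression data with modular structure is where the criterion becomes informative.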
A novel molecular subtyping scheme was developed to divide patients into two clusters (C1 and C2) with different distant recurrence risks. This subtyping scheme was closely associated with tumor metabolism and cellular senescence. Patients in cluster C2 had poorer DRFS than those in cluster C1 (HR: 4.04; 95% CI: 2.60–6.29; log-rank test p < 0.0001). High GATA3 expression, high levels of resting mast cell infiltration, and a high proportion of estrogen receptor (ER)-positive patients contributed to better DRFS in cluster C1. We established a nomogram based on N stage, RCB class, and molecular subtype. The ROC curve for 5-year DRFS showed excellent predictive value (AUC = 0.91, 95% CI: 0.86–0.95), with a C-index of 0.85 (95% CI: 0.81–0.90). Entinostat was identified as a potential small-molecule compound to reverse high RCB after NAC. We also provided a comprehensive review of EDC exposures that may affect the effectiveness of NAC in BC patients.

Conclusion
This study established a molecular classification scheme associated with tumor metabolism and cancer cell senescence to predict RCB and DRFS in BC patients after NAC. Furthermore, GATA3 was identified and validated as a key gene associated with BC recurrence.
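The C-index reported for the nomogram measures how often a model ranks a pair of patients in the correct order of risk. Below is a minimal Python sketch of Harrell's concordance index for right-censored data. The function name and the four toy patients are illustrative assumptions and are not taken from the study; pairs with tied follow-up times are simply skipped.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    times:       follow-up time for each patient
    events:      1 if the event (e.g. distant recurrence) was observed, 0 if censored
    risk_scores: model output, higher meaning higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is usable only if the shorter time is an observed event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0  # earlier event had the higher risk score
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy example (invented values): the third patient is censored at time 6.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk = [0.9, 0.1, 0.4, 0.7]
print(harrell_c_index(times, events, risk))  # 0.6
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported 0.85 should be read.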



